Electronic Transport in DNA 
We study the electronic properties of DNA by way of a tight-binding model applied to four
particular DNA sequences. The charge transfer properties are presented in terms of localisation 
lengths, crudely speaking the length over which electrons travel. Various types of disorder, including 
random potentials, are employed to account for different real environments. We have performed 
calculations on poly(dG)-poly(dC), telomeric-DNA, random-ATGC DNA and λ-DNA. We find that
random and λ-DNA have localisation lengths allowing for electron motion among a few dozen base
pairs only. A novel enhancement of localisation lengths is observed at particular energies for an 
increasing binary backbone disorder. We comment on the possible biological relevance of sequence 
dependent charge transfer in DNA. 

PACS numbers: 72.15.Rn, 87.15.Cc, 73.63.-b



I. INTRODUCTION 

The question of whether DNA conducts electric 
charges is intriguing to physicists and biologists alike. 
The suggestion that electron transfer/transport in DNA 
might be biologically important has triggered a series of 
experimental and theoretical investigations.
Processes that possibly use electron transfer include
the function of DNA damage response enzymes,
transcription factors or polymerase co-factors, all of which
play important roles in the cell. Indeed there is direct
evidence [9] that MutY, a DNA base excision repair enzyme
with an [4Fe4S]+ cluster of undetermined function,
takes part in some kind of electron transfer as part of
the DNA repair process. This seems consistent
with studies in which an electric current is passed through
DNA revealing that damaged regions have significantly
different electronic behaviour than healthy ones.

For physicists, the continuing progress of nanotechnologies
and the consequent need for further size miniaturisation
make the DNA molecule an excellent candidate
for molecular electronics [23]. DNA might
serve as a wire, transistor, switch or rectifier depending
on its electronic properties.

In its natural environment, DNA is always in liq- 
uid solution and therefore experimentally one can study 
the molecule either in solution or in artificially imposed 
dry environments. In solution experiments DNA can 
be chemically processed to host a donor and an ac- 
ceptor molecule at different sites along its long axis. 
Photo-induced charge transfer rates can then be mea- 
sured whilst the donor/acceptor molecules, the distance 
and the sequence of DNA that lies between them are var- 
ied. The reactions are observed to depend on the type of 
DNA used, the intercalation, the integrity of the intervening
base pair stack and, albeit weakly, on the molecular
distance.

Direct conductivity measurements on dry DNA have
also been performed in the past few years. The remarkable
diversity that characterises the results seems to arise
from the fact that many factors need to be experimentally
controlled. These include methods for DNA alignment
and drying, the nature of the devices used to measure
the conductivity, the type of metallic contacts and
the sequence and length of the DNA. DNA has been
reported to be an insulator, an ohmic conductor
and a semiconductor. Theoretically,
single-step superexchange [31] and multi-step hopping
models have provided interpretations of solution
experiments. For experiments in dry DNA, several additional
approaches such as variable range hopping [57],
one-dimensional quantum mechanical tight-binding models
and non-linear methods have also been proposed.

Despite the lack of a consistent picture for the elec- 
tronic properties of DNA, one conclusion has been es- 
tablished: the environment of the DNA impacts upon 
its structural, chemical and thus probably also electronic 
properties. Both theoretical and experimental studies 
show that the temperature and the type of solution sur- 
rounding DNA have a significant effect on its structure 
and shape. The effect of the environment is a
key one to this report, where the environmental fluctua- 
tions are explicitly modelled as providing different types 
of disorder. 

In this work, we focus on whether DNA, when treated 
as a quantum wire in the fully coherent low-temperature 
regime, is conducting or not. To this end, we study and 
generalise a tight-binding model of DNA which has been 
shown to reproduce experimental [13] as well as ab-initio
results. A main feature of the model is the presence
of sites which represent the sugar-phosphate backbone of
DNA but along which no electron transport is permissible.
We measure the "strength" of the electronic transport
by the localisation length ξ, which roughly speaking
parametrises whether an electron is confined to a certain
region ξ of the DNA (insulating behaviour) or can proceed
across the full length L (< ξ) of the DNA molecule
(metallic behaviour).

Sections II and III introduce our models and the numerical
approach. In section V we show that DNA sequences
with different arrangements of the nucleotide bases Adenine
(A), Cytosine (C), Guanine (G) and Thymine (T) exhibit
different ξ's when measured, e.g. as a function of the Fermi
energy E. The influence of external disorder, modelling
variants in the solution, bending of the DNA molecule,
finite-temperature effects, etc., is studied in section VI
where we show that, surprisingly, the models support an
increase of ξ when disorder is increased. We explain that
this effect is linked to the existence of the backbone sites.

FIG. 1: The chemical composition of DNA with the four
bases Adenine, Thymine, Cytosine, Guanine and the backbone.
The backbone is made of phosphorylated sugars shown
in yellow and brown.



II. TIGHT-BINDING MODELS FOR DNA 
WITH A GAP IN THE SPECTRUM 

A. The Fishbone model 

DNA is a macro-molecule consisting of repeated stacks
of bases formed by either AT (TA) or GC (CG) pairs
coupled via hydrogen bonds and held in the double-helix
structure by a sugar-phosphate backbone. In Fig. 1
we show a schematic drawing. In most models of
electronic transport it has been assumed that
the transmission channels are along the long axis of
the DNA molecule and that the conduction path
is due to π-orbital overlap between consecutive bases
[52]; density-functional calculations [37] have shown that
the bases, especially Guanine, are rich in π-orbitals.
Quantum mechanical approaches to the problem mostly
use strictly one-dimensional (1D) tight-binding models.

FIG. 2: The fishbone model for electronic transport along
DNA corresponding to the Hamiltonian given in Eq. (1).
Lines denote hopping amplitudes and circles give the central
(grey) and backbone (open) sites.

Of particular interest to us is a quasi-1D model
which includes the backbone structure of DNA explicitly
and exhibits a semiconducting gap. This fishbone model,
shown in Fig. 2, has one central conduction channel in
which individual sites represent a base-pair; these are
interconnected and further linked to upper and lower sites,
representing the backbone. The backbone sites themselves
are not interconnected along the backbone. Every link
between sites implies the presence of a hopping amplitude.
The Hamiltonian for the fishbone model ($H_F$) is given by:

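$$
H_F = \sum_{i=1}^{L} \sum_{q=\uparrow,\downarrow}
\left( -t_i |i\rangle\langle i+1| - t_i^q |i,q\rangle\langle i|
+ \varepsilon_i |i\rangle\langle i|
+ \varepsilon_i^q |i,q\rangle\langle i,q| \right) + \mathrm{h.c.}
\qquad (1)
$$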

where $t_i$ is the hopping between nearest-neighbour sites
i, i + 1 along the central branch, and $t_i^q$ with
$q = \uparrow, \downarrow$ gives the hopping from each site
on the central branch to the upper and lower backbone
respectively. Additionally, we denote the onsite energy at
each site along the central branch by $\varepsilon_i$ and
the onsite energy at the sites of the upper and lower
backbone is given by $\varepsilon_i^q$, with
$q = \uparrow, \downarrow$. L is the number of sites/bases
in the sequence. The model (1) clearly represents a dramatic
simplification of DNA. Nevertheless, in Ref. [13] it had
been shown that this model, when applied to an artificial
sequence of repeated GC base pairs, poly(dG)-poly(dC) DNA,
reproduces experimental current-voltage measurements
when $t_i = 0.37$ eV and $t_i^q = 0.74$ eV are being used.
Therefore, we will assume $t_i^q = 2 t_i$ and set the energy
scale by $t_i = 1$ for hopping between GC pairs. In what
follows we will adopt energy units in which eV = 1 throughout.

For natural DNA sequences, we need to know how the 
hopping amplitudes vary as the electron moves between 
like pairs, i.e. from GC to GC or from AT to AT, and 
unlike pairs, i.e., from GC to AT and vice versa. We
choose $t_i = 1$ between identical and matching bases (e.g.
AT/TA, GC/CG). Assuming that the wavefunction overlap
between consecutive bases along the DNA strand is
weaker between unlike and non-matching bases (AT/GC,
TA/GC, etc.) we thus choose $t_i = 1/2$.
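
The rule for like and unlike pairs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `hopping_sequence` and the lookup table `PAIR_TYPE` are hypothetical, and the sequence is assumed to be given as the bases of one strand, so that A and T both denote an AT pair and G and C both denote a GC pair.

```python
# Sketch: hopping amplitudes t_i along the central branch of the
# fishbone model, derived from the base sequence of one strand.
# Names are illustrative and not taken from the original work.
PAIR_TYPE = {"A": "AT", "T": "AT", "G": "GC", "C": "GC"}


def hopping_sequence(bases, t_like=1.0, t_unlike=0.5):
    """Return the L-1 hoppings between consecutive base pairs."""
    kinds = [PAIR_TYPE[b] for b in bases.upper()]
    # like pairs (AT/TA, GC/CG) get t_like, unlike pairs get t_unlike
    return [t_like if a == b else t_unlike
            for a, b in zip(kinds, kinds[1:])]


print(hopping_sequence("TTAGGG"))  # [1.0, 1.0, 0.5, 1.0, 1.0]
```

For one telomeric repeat unit the only weak link is the single AT to GC step, which is why periodic repetition of the unit still gives a perfectly periodic chain.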

FIG. 3: The ladder model for electronic transport along
DNA. The model corresponds to the Hamiltonian (2).



B. The Ladder model 

We performed semi-empirical calculations on DNA
base pairs and stacks using the SPARTAN quantum
chemistry software package. The results have shown
that the relevant electronic states of DNA (highest-occupied
and lowest-unoccupied molecular orbitals with
and without an additional electron) are localised on one
of the bases of a pair only. The reduction of the DNA
base-pair architecture into a single site per pair, as in
the fishbone model (1), is obviously a highly simplified
approach. As an improvement on this we model each
base as a distinct site where the base pair is then weakly
coupled by the hydrogen bonds. The resulting 2-channel
model is shown in Fig. 3. This ladder model is a planar
projection of the structure of the DNA with its double-helix
unwound. We note that results for electron transfer
also suggest that the transfer proceeds preferentially
down one strand [25]. There are two central branches,
linked with one another, with interconnected sites where
each represents a complete base and which are additionally
linked to the upper and lower backbone sites. The
backbone sites as in the fishbone model are not interconnected.
The Hamiltonian for the ladder model is given
by



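$$
H_L = \sum_{i=1}^{L} \left[ \sum_{\tau=1,2}
\left( t_{i,\tau} |i,\tau\rangle\langle i+1,\tau|
+ \varepsilon_{i,\tau} |i,\tau\rangle\langle i,\tau| \right)
+ \sum_{q=\uparrow,\downarrow}
\left( t_i^q |i,\tau\rangle\langle i,q(\tau)|
+ \varepsilon_i^q |i,q\rangle\langle i,q| \right)
+ t_{1,2} |i,1\rangle\langle i,2| \right] + \mathrm{h.c.}
\qquad (2)
$$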



where $t_{i,\tau}$ is the hopping amplitude between sites along
each branch $\tau = 1, 2$ and $\varepsilon_{i,\tau}$ is the
corresponding onsite potential energy. $t_i^q$ and
$\varepsilon_i^q$ as before give hopping amplitudes and
onsite energies at the backbone sites.
Also, $q(\tau) = \uparrow, \downarrow$ for $\tau = 1, 2$,
respectively. The new parameter $t_{1,2}$ represents the
hopping between the two central branches, i.e.,
perpendicular to the direction of conduction. SPARTAN
results suggest that this value, dominated by the wave
function overlap across the hydrogen bonds, is weak and
so we choose $t_{1,2} = 1/10$.



C. Including disorder 

In order to study the transport properties of DNA, we
could now either use artificial DNA (poly(dG)-poly(dC)
[43], random sequences of A, T, G, C, etc.) or natural
DNA (bacteriophage λ-DNA [37], etc.). The biological
content of the sequence would then simply be encoded
in a specific sequence of hopping amplitudes 1 and 1/2
between like and unlike base-pair sequences. However, in
vivo and in most experimental situations, DNA is exposed
to diverse environments and its properties, particularly
those related to its conformation, can change drastically
depending on the specific choice. The solution, thermal
effects, presence of binding and packaging proteins and
the available space are factors that alter the structure
and therefore the properties that one is measuring.
Clearly, such dramatic changes should also be reflected
in the electronic transport characteristics. Since it is
precisely the backbone that will be most susceptible to such
influences, we model such environmental fluctuations by
including variations in the onsite potentials $\varepsilon_i^q$.

Different experimental situations will result in a different
modification of the backbone electronic structure,
and we model this by choosing different distribution
functions for the onsite potentials, ranging from uniform
disorder $\varepsilon_i^q \in [-W/2, W/2]$, to Gaussian
disorder and on to binary disorder $\varepsilon_i^q = \pm W/2$.
W is a measure for the strength of the disorder in all cases.
Particularly the binary disorder model can be justified by
the localisation of ions or other solutes at random
positions along the DNA strand.
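
The three disorder distributions can be sampled as in the following sketch. The function name `backbone_disorder` is hypothetical. The text does not state how W sets the width of the Gaussian distribution, so the choice of a standard deviation of W/sqrt(12), which matches the variance of the uniform case, is an assumption made purely for illustration.

```python
import numpy as np


def backbone_disorder(n_sites, W, kind="uniform", rng=None):
    """Draw n_sites backbone onsite energies of disorder strength W."""
    rng = np.random.default_rng() if rng is None else rng
    if kind == "uniform":
        # eps_i^q uniformly distributed in [-W/2, W/2]
        return rng.uniform(-W / 2, W / 2, size=n_sites)
    if kind == "gaussian":
        # ASSUMPTION: standard deviation W/sqrt(12), i.e. the same
        # variance as the uniform distribution of width W
        return rng.normal(0.0, W / np.sqrt(12.0), size=n_sites)
    if kind == "binary":
        # eps_i^q = -W/2 or +W/2 with equal probability
        return rng.choice([-W / 2, W / 2], size=n_sites)
    raise ValueError(f"unknown disorder type: {kind}")


rng = np.random.default_rng(seed=1)
eps_up = backbone_disorder(6000, W=1.0, kind="binary", rng=rng)
eps_down = backbone_disorder(6000, W=1.0, kind="binary", rng=rng)
```

Drawing the upper and lower backbone energies independently, as done here, is one possible reading of the model; correlated choices would be equally easy to implement.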



D. Effective models and the energy gap 

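Due to the non-connectedness of the backbone sites
along the DNA strands, the models (1) and (2) can be
further simplified to yield models in which the backbone
sites are incorporated into the electronic structure of the
DNA. The effective fishbone model is then given by

$$
\tilde{H}_F = \sum_{i=1}^{L} \left( -t_i |i\rangle\langle i+1|
+ \left[ \varepsilon_i - \sum_{q=\uparrow,\downarrow}
\frac{(t_i^q)^2}{\varepsilon_i^q - E} \right]
|i\rangle\langle i| \right) + \mathrm{h.c.}
\qquad (3)
$$

Similarly, the effective ladder model reads as

$$
\tilde{H}_L = \sum_{i=1}^{L} \left( t_{1,2} |i,1\rangle\langle i,2|
+ \sum_{\tau=1,2} \left\{ t_{i,\tau} |i,\tau\rangle\langle i+1,\tau|
+ \left[ \varepsilon_{i,\tau}
- \frac{(t_i^{q(\tau)})^2}{\varepsilon_i^{q(\tau)} - E} \right]
|i,\tau\rangle\langle i,\tau| \right\} \right) + \mathrm{h.c.}
\qquad (4)
$$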



In these two models, the backbone has been incorporated
into an energy-dependent onsite potential on the main
DNA sites. This re-emphasises that the presence of the
backbone influences the local electronic structure on the
DNA bases and similarly, any variation in the backbone
disorder potentials $\varepsilon_i^q$ will result in a
variation of the effective onsite potentials as given in
the brackets of Eqs. (3) and (4).
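
As a small numerical illustration of the energy-dependent onsite potential of the effective fishbone model, the sketch below evaluates the onsite energy reduced by the sum over both backbone sites of the squared hopping divided by the backbone energy minus E. The function name `effective_onsite` is hypothetical; the default hoppings follow the choice of backbone hopping equal to twice the central hopping made above.

```python
def effective_onsite(E, eps, eps_up, eps_down, t_up=2.0, t_down=2.0):
    """Effective onsite energy of one central fishbone site.

    eps is the bare onsite energy, eps_up/eps_down are the backbone
    onsite energies and t_up/t_down the backbone hoppings.
    """
    return eps - t_up**2 / (eps_up - E) - t_down**2 / (eps_down - E)


# Both backbone sites at +W/2 and energy E = 0: the magnitude of the
# correction shrinks as the disorder strength W grows.
for W in (1.0, 2.0, 5.0):
    print(W, effective_onsite(0.0, 0.0, W / 2, W / 2))
```

The expression diverges when E coincides with a backbone onsite energy, so callers should avoid evaluating it exactly at such energies.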

Both models allow us to quickly calculate the gap of
the completely ordered system (all onsite potentials
zero) by assuming that the lowest-energy state
$\psi = \sum_i \psi_{i(,\tau)} |i(,\tau)\rangle$ in each band
corresponds to constant $\psi_i$ ($\psi_{i,\tau}$) whereas for
the highest-energy states, a checkerboard pattern is
obtained with $\psi_i = -\psi_{i+1}$
($\psi_{i,\tau} = -\psi_{i+1,\tau}$, $\psi_{i,1} = -\psi_{i,2}$).
For the fishbone model, this shows that, e.g.

$$
E_{\min,\mp} = -t_i \mp \sqrt{t_i^2 + (t_i^{\uparrow})^2 + (t_i^{\downarrow})^2}
\quad \text{and} \quad
E_{\max,\mp} = t_i \mp \sqrt{t_i^2 + (t_i^{\uparrow})^2 + (t_i^{\downarrow})^2} .
$$

For the chosen set of hopping parameters for (1) and (2),
this gives $E_{\min,\mp} = -4, 2$ and $E_{\max,\mp} = -2, 4$
for the fishbone model and $E_{\min,\mp} \approx -3.31, 1.21$
and $E_{\max,\mp} \approx -1.21, 3.31$ for the ladder model.
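
The quoted band edges can be checked with a short calculation. The sketch below assumes the completely ordered system with all onsite potentials zero. For the fishbone model it solves the quadratic that follows from Eq. (3) for constant and alternating wave functions; for the ladder model it assumes, as a reading of Eq. (4), that the band edges satisfy E - (t^q)^2/E = ±(2t + t_{1,2}). The function names are hypothetical.

```python
import math


def fishbone_band_edges(t=1.0, t_up=2.0, t_down=2.0):
    """Return ((lower band min, max), (upper band min, max))."""
    root = math.sqrt(t**2 + t_up**2 + t_down**2)
    return (-t - root, t - root), (-t + root, t + root)


def ladder_band_edges(t=1.0, t_q=2.0, t12=0.1):
    """Band edges of the ordered ladder model (assumed form, see text)."""
    a = 2.0 * t + t12          # extremal value of 2 t cos(k) +- t12
    root = math.sqrt(a**2 + 4.0 * t_q**2)
    lower = ((-a - root) / 2.0, (a - root) / 2.0)
    upper = ((-a + root) / 2.0, (a + root) / 2.0)
    return lower, upper


print(fishbone_band_edges())
print(ladder_band_edges())
```

With the default parameters the fishbone bands span the intervals from -4 to -2 and from 2 to 4, and the ladder values round to the numbers quoted in the text.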



III. THE NUMERICAL APPROACH AND 
LOCALISATION 

There are several approaches suitable for studying the
transport properties of the models (1) and (2), and these
can be found in the literature on transport in solid state
devices, or, perhaps more appropriately, quantum wires.
Since the variation in the sequence of base pairs precludes
a general solution, we will use two methods well-known
from the theory of disordered systems.

The first method is the iterative transfer-matrix
method (TMM) which allows us in
principle to determine the localisation length ξ of
electronic states in systems with cross sections M = 1
(fishbone) and 2 (ladder) and length L ≫ M, where typically
a few million sites are needed for L to achieve reasonable
accuracy for ξ. However, in the present situation
we are interested in finding ξ also for viral DNA strands
of typically only a few tens of thousands of base pairs.
Thus in order to restore the required precision,
we have modified the conventional TMM and now perform
the TMM on a system of fixed length $L_0$. This
modification has been previously used [22] and
may be summarised as follows: After the usual forward
calculation with a global transfer matrix $T_{L_0}$, we add
a backward calculation with transfer matrix $T^{b}_{L_0}$. This
forward-backward-multiplication procedure is repeated
K times. The effective total number of TMM multiplications
is $L = 2KL_0$ and the global transfer-matrix is
$\tau_L = (T^{b}_{L_0} T_{L_0})^K$. It can be diagonalised as
for the standard TMM with $K \to \infty$ to give
$\tau_L^{\dagger} \tau_L \to \exp[\mathrm{diag}(4KL_0/\xi_\tau)]$
with $\tau = 1$ or $\tau = 1, 2$ for fishbone and ladder model,
respectively. The largest $\xi_\tau$ over all $\tau$ then
corresponds to the localisation length of the electron on
the DNA strand and will be measured in units of the DNA
base-pair spacing (0.34 nm).
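
A minimal sketch of the forward-backward idea for the effective fishbone chain (M = 1) is given below. It is not the implementation used for the results in this work. It assumes that the chain obeys E ψ_n = ε̃_n ψ_n - t_n ψ_{n+1} - t_{n-1} ψ_{n-1}, where ε̃_n is the bracket of Eq. (3); it realises the backward sweep by traversing the mirrored sequence, and it joins successive sweeps with a hypothetical turning bond `t_turn`. The function name is likewise illustrative.

```python
import numpy as np


def inverse_localisation_length(E, eps_eff, hop, K=20, t_turn=1.0):
    """Estimate 1/xi (in inverse base-pair spacings) from K
    forward-backward sweeps over a strand of L0 sites.

    eps_eff: L0 effective onsite energies, hop: L0 - 1 hoppings.
    """
    eps_eff = np.asarray(eps_eff, dtype=float)
    hop = np.asarray(hop, dtype=float)
    # one forward sweep followed by one backward (mirrored) sweep
    path_eps = np.concatenate([eps_eff, eps_eff[::-1]])
    path_hop = np.concatenate([hop, [t_turn], hop[::-1], [t_turn]])
    psi_prev, psi = 0.0, 1.0
    log_growth = 0.0
    for _ in range(K):
        for n in range(len(path_eps)):
            t_next, t_prev = path_hop[n], path_hop[n - 1]
            psi_next = ((path_eps[n] - E) * psi - t_prev * psi_prev) / t_next
            # renormalise at every step to avoid overflow
            norm = np.hypot(psi, psi_next)
            log_growth += np.log(norm)
            psi_prev, psi = psi / norm, psi_next / norm
    # growth rate per site over L = 2 K L0 multiplications
    return log_growth / (K * len(path_eps))


# Ordered poly(dG)-poly(dC) chain: eps_eff = 8/E for t^q = 2, eps = 0.
L0 = 100
for E in (1.0, 3.0):
    print(E, inverse_localisation_length(E, [8.0 / E] * L0, [1.0] * (L0 - 1)))
```

For the ordered chain the estimate is close to zero inside a band (E = 3) and clearly finite inside the gap (E = 1), in line with the behaviour described for clean poly(dG)-poly(dC) DNA.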

The second method that we will use is the recursive
Green function approach pioneered by MacKinnon
[27, 28]. It can be used to calculate the dc and ac
conductivity tensors and the density of states (DOS) of a
d-dimensional disordered system and has been adapted
to calculate all kinetic linear-transport coefficients such
as thermoelectric power, thermal conductivity, Peltier
coefficient and Lorentz number [51].

The main advantage of both methods is that they work
reliably (i) for short DNA strands ranging from 13 base
pairs (DFT studies [37]) up to 30 base pairs length which
are being used in the nanoscopic transport measurements,
as well as (ii) for somewhat longer DNA sequences as
modelled in the electron transfer results and (iii) even for
complete DNA sequences which contain, e.g. for human
chromosomes, up to 245 million base pairs.



IV. DNA SEQUENCES 

The exact arrangement of the four bases A, T, G, C de- 
termines the nature and function of its associated DNA 
strand such as the chemical composition of the proteins 
which are encoded. While previous studies have aimed 
to elucidate whether DNA conducts at all, we shall also 
focus our attention to investigate how different DNA se- 
quences, be they artificial or naturally occurring, "con- 
duct" charge differently. Thus we study a set of different 
DNA. 

A convenient starting point for most electronic 
transport studies is the aforementioned poly(dG)-
poly(dC) sequence, which corresponds to a simple repeti- 
tion of a GC (or CG) pair. Note that within our models, 
there is no difference between GC and CG pairs. Al- 
though not occurring naturally, such sequences can be 
synthesised easily. Another convenient choice of artifi- 
cial DNA strand is a simple random sequence of the four 
bases, which we construct with equal probability for all 
4 bases. However, they are not normally used in experi- 
ments. 

As DNA samples existing in living organisms,
we shall use λ-DNA of the bacteriophage virus
[Bacteriophage lambda] which has a sequence of 48502
base pairs. It corresponds to a bacterial virus and is
biologically very well characterised. We also investigate the
29728 bases of the SARS virus [SARS]. Telomeric DNA
is a particular buffer part at the beginning and ends
of DNA strands for eukaryote cells. In mammals it is
a Guanine rich sequence in which the pattern TTAGGG
is repeated over thousands of bases. Its length is known
to vary widely between species and individuals but we
assume a length of 6000 base-pairs. Last, we show some
studies of centromeric DNA for chromosome 2 of yeast
with 813138 base pairs [CEN2]. This DNA is also reportedly
rich in G bases and has a high rate of repetitions
which should be favourable for electronic transport.

Initially, we will compute transport properties for com- 
plete DNA sequences, i.e. including and not differentiat- 
ing between coding and non-coding sequences (this dis- 
tinction applies to the naturally occurring DNA strands 
only). However, we will later also study the difference 
between those two different parts of a given DNA. We 
emphasise that while non-coding DNA suffers from the
label of "junk", it is now known to play several important
roles in the functioning of DNA.

Before leaving the description of our DNA sequences, 
we note that occasionally, we show results for "scram- 
bled" DNA. This is DNA with the same number of A, 
T, C, G bases, but with their order randomised. Clearly, 
such sequences contain the same set of electronic poten- 
tials and hopping variations, but would perform quite 
differently if released into the wild. A comparison of 
their transport properties with those from the original 
sequence thus allows us to measure how important the
exact fidelity of a sequence is.
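
The artificial sequences described in this section are easy to generate. The sketch below builds a poly(dG)-poly(dC) strand, a telomeric strand from the TTAGGG repeat, a random-ATGC strand with equal probability for all four bases, and a scrambled copy of a given sequence. All function names are hypothetical, and natural sequences such as λ-DNA would have to be read from a sequence database instead.

```python
import random


def poly_gc(length):
    """poly(dG)-poly(dC): one strand is a plain run of G."""
    return "G" * length


def telomeric(length, unit="TTAGGG"):
    """Repeat the telomeric unit and cut to the requested length."""
    return (unit * (length // len(unit) + 1))[:length]


def random_atgc(length, rng):
    """Random sequence with equal probability for A, T, G and C."""
    return "".join(rng.choice("ATGC") for _ in range(length))


def scramble(sequence, rng):
    """Same base content as the input, but in randomised order."""
    bases = list(sequence)
    rng.shuffle(bases)
    return "".join(bases)


rng = random.Random(2005)
telo = telomeric(6000)
scrambled = scramble(telo, rng)
print(telo[:12], scrambled[:12])
```

Because scrambling only permutes the bases, the scrambled strand has the same numbers of A, T, G and C as the original while its sequence of like and unlike hoppings differs.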



V. RESULTS FOR CLEAN DNA 

Let us start by studying the localisation properties of
DNA without any onsite disorder either at
$\varepsilon_{i,\tau}$ or at $\varepsilon_i^q$. For a
poly(dG)-poly(dC) sequence, both fishbone and ladder model
produce two separate energy bands between the extremal
values computed at the end of section II D. Within these
energy bands, the electronic states are extended with
infinite localisation length ξ as expected.
Outside the bands, transport is exponentially damped
due to an absence of states and the ξ values are very
close to zero. In Fig. 4 the resulting inverse localisation
lengths are shown. These are zero for the extended states
in the two bands, but finite outside, showing the quick
decrease of the localisation lengths outside the bands. In
Fig. 5 we show the same data but now plot the localisation
length itself. We see that the energy gap observed
previously [13] for the poly(dG)-poly(dC) sequence in the
fishbone model remains. The difference with respect to
the ladder model is a slight renormalisation of the gap
width. The localisation lengths of poly(dG)-poly(dC)
DNA tend to infinity, meaning that the sequence is
perfectly conducting. This is expected due to its periodic
electronic structure.

Turning our attention to the other three DNA sequences,
we find that telomeric DNA also gives rise to
perfect conductivity like poly(dG)-poly(dC) DNA. But
due to its structure of just 6 repeating base pairs, there
is a further split of each band into 3 separate sub-bands.
They may be calculated as in section II D. We would
like to point out that it may therefore be advantageous
to use the naturally occurring telomeric parts of DNA
sequences as prime, in-vivo candidates when looking for
good conductivity in a DNA strand.

FIG. 4: Plot of the inverse localisation lengths 1/ξ as a
function of Fermi energy for the ladder model and four DNA
sequences as well as for the fishbone model with a
poly(dG)-poly(dC) sequence. The data for telomeric DNA has
been shaded for clarity. Lines are guides to the eye only.

FIG. 5: Localisation lengths as a function of energy for
poly(dG)-poly(dC), telomeric, random-ATGC, and λ-DNA as
described in the text. The spectrum is symmetric in energy.
The data for telomeric DNA has been shaded for clarity. Lines
are guides to the eye only.

The structure of the energy dependence for the
random-ATGC and the λ-DNA is very different from
the preceding two sequences, but it is quite similar
between just these two. The biological content of the DNA
sequences is, within the description by our quantum
models, just a sequence of binary hopping elements
between like and unlike base pairs. Thus the models are
related to the physics of random hopping models
and in agreement with these, we see a Dyson peak [18]
in the centre of each sub-band. Furthermore, we see that
the range of energies for which we observe non-zero
localisation lengths is increased into the gap and for large
absolute values of the energy. This is similar to the
broadening of the single energy band for the Anderson model
of localisation [50]. The localisation lengths, which roughly
equal the average distance an electron would be able to
travel (conduct), are close to the distance of 20 bases
within the band, with a maximum of about 30 bases at the
centre of each band. Note that this result is surprisingly
good, given the level of abstraction used in the present
models, when compared to the typical distances over
which electron transfer processes have been shown to be
relevant.



VI. RESULTS FOR DISORDERED DNA 
A. DNA randomly bent or at finite temperatures 

As argued before, environmental influences on the
transport properties of DNA are likely to influence
predominantly the electronic structure of the backbone.
Within our models, this can be captured by adding a
suitable randomness onto the backbone onsite potentials
$\varepsilon_i^q$. In this fashion, we can model for example
the influence of a finite temperature and thus a coupling to
phonons [24]. We emphasise, however, that in order for
our localisation results, which rely on quantum mechanical
interference effects, to remain valid, the phase
breaking lengths should stay much larger than the
sequence lengths. Thus the permissible temperature range
is a few K only. The bending of DNA is another possibility
which can be modelled by a local, perhaps regular,
change in $\varepsilon_i^q$ along the strand. Another
important aspect is the change in $\varepsilon_i^q$ due to
the presence of a solution in which DNA is normally immersed.

All these effects can be modelled in a first attempt by
choosing an appropriate distribution function
$P(\varepsilon_i^q)$. Let us first choose uniform disorder
with $\varepsilon_i^q \in [-W/2, W/2]$.
In Fig. 6 we show the results for all 4 DNA sequences
as a function of energy for W = 1. Comparing this to
Fig. 5 we see that now all localisation lengths are finite;
poly(dG)-poly(dC) and telomeric DNA having localisation
lengths of a few hundreds and a few tens of
bases, respectively. The localisation lengths for
random-ATGC and λ-DNA are only slightly reduced. In all cases,
the structure of 2 energy bands remains. Furthermore,
W = 1 already represents a sizable broadening of about
1/2 the width of each band. Thus although the localisation
lengths are finite compared to the results of section V,
they are still larger than the lengths of the DNA
strands used in the nano-electric experiments, implying
finite conductances. We remark that the Dyson peaks
have vanished as expected. We plot the DOS for
λ-DNA in Fig. 6 which clearly indicates the 2 bands.



FIG. 6: Top: Energy dependence of the localisation lengths,
ξ(E), for poly(dG)-poly(dC), telomeric, random-ATGC and
λ-DNA in the presence of uniform backbone disorder with
W = 1. Only every 2nd and 5th symbol is shown for
random-ATGC and λ-DNA, respectively. Bottom: DOS for λ-DNA
using the same parameters as in the top panel.

FIG. 7: Top: ξ(E) as in Fig. 6 but with W = 2. Only
every 2nd and 5th symbol is shown for random-ATGC and
λ-DNA, respectively. Bottom: DOS for λ-DNA using the same
parameters as in the top panel.



Upon further increasing the disorder to W = 2, as shown
in Fig. 7, the localisation lengths continue to decrease.
Note that we observe a slight broadening of the bands
and states begin to shift into the gap. We also see that
the behaviour of random-ATGC and λ-DNA is quite similar
and at these disorder strengths, even telomeric DNA
follows the same trends. At W = 5 (Fig. 8), the localisation
lengths have been reduced to a few base-pair separation
distances and the differences between all 4 sequences are
very small. The gap has been nearly completely filled as
shown by the DOS, albeit with states which have a very
small localisation length. This will become important
later.

Thus, in summary, we have seen that adding uniform
disorder onto the backbone leads to a reduction of the
localisation lengths and consequently a reduction of the
electron conductance. Strictly speaking, all 4 strands
are insulators. However, their localisation lengths can
remain quite large, larger than the strand lengths used in
many of the experiments. Thus even the localised electron
can contribute towards a finite conductivity for these short
sequences. In agreement with experiments, poly(dG)-poly(dC)
DNA is the most prominent candidate.

FIG. 8: Top: ξ(E) as in Fig. 6 but with W = 5. Only
every 2nd and 5th symbol is shown for random-ATGC and
λ-DNA, respectively. Bottom: DOS for λ-DNA using the same
parameters as in the top panel.



B. DNA in an ionic solution 

When in solution, the negatively charged oxygen on the
backbone will attract cations such as Na+. This will give
rise to a dramatic change in local electronic properties at
the oxygen-carrying backbone site, but not necessarily
influence the neighbouring sites. The effects at each such
site will be the same and thus, in contrast to the uniform
disorder used in section VI A, a binary distribution such
as $\varepsilon_i^q = \pm W/2$ is more appropriate. For
simplicity we choose 50% of all backbone sites to be occupied
with $\varepsilon_i^q = -W/2$ while the other half remains
empty with $\varepsilon_i^q = +W/2$. We note that a mixture
of concentrations has been studied in the context of the
Anderson model in Ref. [53].

In Fig. 9 we show the results for moderate binary disorder.
In comparison with the uniformly disordered case
of Fig. 6 we see that the localisation lengths have
decreased further. This is expected because binary disorder
is known to be very strong. Also, the gap has
already started to fill.

Increasing the disorder leads again to a decrease of ξ in
the energy regions corresponding to the bands. Directly
at E = ±W/2, we observe 2 strong peaks in the DOS
which are accompanied by reduced localisation lengths.
These peaks correspond to the infinite potential barrier or
well at E = -W/2 or +W/2, respectively, as indicated
by Eq. (4). In Fig. 9 these peaks were not yet visible.
We also see in Fig. 10 that the localisation lengths for
states in the band centre start to increase to values > 1.
This trend continues for larger W as shown in Fig. 11.
We see a crossover into a regime where the two original,
weak-disorder bands have nearly vanished and states in
the centre at E = 0 are starting to show an increasing
localisation length upon increasing the binary disorder.
A further increase in W eventually leads to the complete
destruction of the original bands and the formation of a
single band symmetric around E = 0 at about W ~ 2.5.

FIG. 9: Top: Energy dependence of the localisation lengths,
ξ(E), for poly(dG)-poly(dC), telomeric, random-ATGC and
λ-DNA in the presence of binary backbone disorder with
W = 1. Only every 2nd and 5th symbol is shown for
random-ATGC and λ-DNA, respectively. Bottom: DOS for λ-DNA
using the same parameters as in the top panel.

FIG. 10: Top: ξ(E) as in Fig. 9 but with W = 2. Only
every 2nd and 5th symbol is shown for random-ATGC and
λ-DNA, respectively. Bottom: DOS for λ-DNA using the same
parameters as in the top panel.

FIG. 11: Top: ξ(E) as in Fig. 9 but with W = 5. Only
every 2nd and 5th symbol is shown for random-ATGC and
λ-DNA, respectively. Bottom: DOS for λ-DNA using the same
parameters as in the top panel.

FIG. 12: Disorder dependence of ξ for poly(dG)-poly(dC),
telomeric, random-ATGC and λ-DNA at E = 0. Only every
10th symbol is shown for all sequences. The shaded curve is
the corresponding unnormalized DOS for λ-DNA.



C. Delocalisation due to disorder 

The results of the previous section suggest that increasing
the disorder in different regions of the energy will lead
to different transport behaviour. Of particular interest is
the region at E = 0. In Fig. 12 the variation of ξ as a
function of binary disorder strength for all different
sequences is shown. While ξ < 1 for small disorder, we
see that upon increasing the disorder, states begin to
appear and their localisation lengths increase for all DNA
sequences. Thus we indeed observe a counter-intuitive
delocalisation by disorder at E = 0. As before,
poly(dG)-poly(dC) and telomeric DNA show the largest
localisation lengths, whereas random-ATGC and λ-DNA give
rise to a smaller and nearly identical effect. In Fig. 13 we
show that this effect does not exist at E = 3, i.e. for
energies corresponding to the formerly largest localisation
lengths. Rather, at E = 3, the localisation lengths for
all DNA sequences quickly drop to ξ ~ 1. The
delocalisation effect is also observed for uniform disorder,
but is much smaller. As shown in Fig. 14 the enhancement is
up to about ξ = 1 for the fishbone model (1). Results for
the ladder model (2) are about 1.7 times larger.

FIG. 13: ξ(W) as in Fig. 12 but with E = 3. Axis label:
Binary Backbone Onsite Disorder. Only every 10th
symbol is shown for all DNA sequences. The shaded curve is
the corresponding unnormalized DOS for λ-DNA.

FIG. 14: ξ(W) as in Fig. 12 but with uniform disorder at
E = 0 and for the fishbone model. Axis label: Uniform
Backbone Onsite Disorder. Only every 10th symbol
is shown for all DNA sequences. The shaded curve is the
corresponding unnormalized DOS for λ-DNA.

This surprising delocalisation-by-disorder behaviour
can be understood by considering the effects of disorder
at the backbone for the effective Hamiltonians of the
fishbone and ladder models. At E = 0, the onsite potential
correction term (t_i^q)^2/(ε_i^q − E) will decrease upon
increasing the ε_i^q values. For binary disorder
ε_i^q = ±W/2, this holds for |ε_i^q| > |E| as shown in the
corresponding figure. However, for large |E|, the
localisation lengths decrease quickly due to the much
smaller density of states. Thus the net effect is an eventual
decrease (or only a very small increase) of ξ for large E.
Note the dip at |ε_i^q| = E = 3 in the figure, which
corresponds to the effective ε_i = ∞, i.e. an infinitely
strong trap yielding extremely strong localisation. For
uniform disorder ε_i^q ∈ [−W/2, W/2], and generally for any
disorder with compact support around E = 0, the above
inequality is never fulfilled, and even for E = 0 we will
find small ε_i^q ~ 0 such that we have strong trapping and
localisation.
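The trapping argument can be illustrated numerically. The following minimal Python sketch evaluates the magnitude of the correction term (t_i^q)^2/(ε_i^q − E) for binary and for uniform backbone disorder. The hopping value t = 1, the sample size and all function names are assumptions made for illustration only; they are not the parameters or the code used for the results above.

```python
import math
import random


def correction(t, eps, energy):
    """Backbone-induced onsite correction t^2 / (eps - E).

    At eps == E the term diverges (infinitely strong trap),
    which is represented here by math.inf.
    """
    if eps == energy:
        return math.inf
    return t * t / (eps - energy)


def binary_max_correction(t, w, energy):
    """Largest |correction| for binary disorder eps = +W/2 or -W/2."""
    return max(abs(correction(t, sign * w / 2.0, energy))
               for sign in (1.0, -1.0))


def uniform_max_correction(t, w, energy, samples=10000, seed=0):
    """Largest |correction| for eps drawn uniformly from [-W/2, W/2]."""
    rng = random.Random(seed)
    return max(abs(correction(t, rng.uniform(-w / 2.0, w / 2.0), energy))
               for _ in range(samples))


if __name__ == "__main__":
    t, energy = 1.0, 0.0  # hypothetical hopping strength, band centre
    for w in (1.0, 2.0, 5.0, 10.0):
        print(w,
              binary_max_correction(t, w, energy),
              uniform_max_correction(t, w, energy))
```

In this sketch the binary case at E = 0 gives a correction of magnitude 2t^2/W, which shrinks as W grows, whereas the uniform case always contains some ε_i^q close to E and therefore some very large corrections.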



VII. INVESTIGATING THE LOCAL 
PROPERTIES OF THE SEQUENCES 

A. Variation of ξ along the DNA strand

In the preceding sections, we computed estimates of the
localisation length ξ for complete DNA strands, i.e. the ξ
values are averages. However, the biological function of DNA
clearly depends on the local structure of the sequence in a
paramount way. After all, only certain parts of DNA code for
proteins, while others do not. In addition, the exact
sequence of the bases specifies the protein that is to be
assembled. Thus, in order to gain access to the local
properties, we have performed computations of ξ on
subsequences of complete DNA strands. We start by
artificially restricting ourselves to finite windows of
length K = 10, 30, 50, 100, 200, 500, 1000 and compute the
localisation lengths ξ_K(r), where r = 1, 2, ..., L − K
denotes the starting position of the window of length K.
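The windowing procedure can be sketched in a few lines of Python. To keep the example self-contained it uses a plain one-dimensional tight-binding chain with a standard transfer-matrix estimate of the inverse localisation length, not the fishbone or ladder model of this paper. The onsite energies in `ONSITE`, the hopping value and the function names are arbitrary placeholders assumed for illustration.

```python
import math

# Placeholder onsite energies per base (arbitrary values, assumed here).
ONSITE = {"A": 0.0, "T": 0.5, "G": -0.5, "C": 0.25}


def localisation_length(window, energy, hopping=1.0):
    """Finite-size estimate of xi for a simple 1D chain.

    Iterates psi[n+1] = ((E - eps[n]) / t) * psi[n] - psi[n-1] and
    accumulates the logarithmic growth of the amplitude vector,
    renormalising at every step to avoid overflow.
    """
    psi_prev, psi = 0.0, 1.0
    log_growth = 0.0
    for base in window:
        psi_next = (energy - ONSITE[base]) / hopping * psi - psi_prev
        psi_prev, psi = psi, psi_next
        norm = math.hypot(psi_prev, psi)
        log_growth += math.log(norm)
        psi_prev /= norm
        psi /= norm
    gamma = log_growth / len(window)  # finite-size Lyapunov exponent
    return 1.0 / gamma if gamma > 0 else math.inf


def sliding_xi(sequence, k, energy):
    """Return xi_K(r) for every window start r (0-based in this sketch)."""
    return [localisation_length(sequence[r:r + k], energy)
            for r in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    toy = "ATGCGGCTTAAGCCGTATGCA" * 10  # toy sequence, not real data
    xi = sliding_xi(toy, k=50, energy=0.0)
    print(len(xi), min(xi), max(xi))
```

Note that the sketch enumerates L − K + 1 windows, which matches the count n = L − (K − 1) used for the correlation function later in this section.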

In order to see how the exact sequence determines our
results, we have also randomly permuted (scrambled) the
λ-DNA sequence so that the content of A, T, G, and C bases
is the same, but their order is randomised. Differences in
the localisation properties should then indicate the
importance of the exact order. From the biological
information available on bacteriophage λ-DNA, we compute
the localisation length for the coding regions [14] and then
for window lengths K that correspond exactly to the length
of each coding region. Again, if the electronic properties,
as measured by the localisation length, are linked to
biological content, we would expect to see characteristic
differences.
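A scrambled control of this kind can be produced with a random permutation, as in the following minimal Python sketch. The function name, the seed argument and the toy sequence are assumptions for illustration; the random number generator actually used for the scrambling is not specified in the text.

```python
import random
from collections import Counter


def scramble(sequence, seed=None):
    """Randomly permute the bases of a sequence.

    The base composition (counts of A, T, G, C) is preserved,
    while the order of the bases is randomised.
    """
    bases = list(sequence)
    random.Random(seed).shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    toy = "ATGCGGCTTAAGCCGTATGCA"  # toy sequence, not real data
    shuffled = scramble(toy, seed=1)
    print(shuffled)
    print(Counter(toy) == Counter(shuffled))
```

Comparing the base counts before and after, as in the last line, is a quick check that only the order has changed.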

In Figs. 15 and 16 we show results for K = 100 and 1000,
respectively. From P(ξ) in Fig. 15 we see that the
localisation lengths for λ-DNA are mostly distributed around
15-20, but P(ξ) has a rather long tail for large ξ. Indeed,
there are some windows where the localisation lengths exceed
even the size of the window K = 100. Thus at specific
positions in the DNA sequence, the system appears essentially
extended with ξ > K. On the other hand, the distribution P(ξ)
is identical when, instead of λ-DNA, we consider scrambled
DNA. Therefore the presence of such regions is not unique to
λ-DNA. The results from windows positioned at the coding part
of λ-DNA appear statistically similar to those for the
complete sequence, i.e. including also the non-coding
regions. This suggests that with respect to the localisation
properties there is no obvious difference between λ-DNA and
scrambled λ-DNA, nor between coding and non-coding regions.
We emphasise that similar results have been obtained for a
DNA sequence constructed from the SARS corona-viral data.

FIG. 15: Top: Variation of the localisation lengths for a
sliding window of length K = 100 as a function of window
starting position for λ-DNA at E = 3. The black crosses
(x) denote results for windows corresponding to the coding
sequences of λ-DNA only. The dashed horizontal line denotes
K. Middle: Same as in the top panel but with randomly
scrambled λ-DNA. Bottom: Normalised distribution functions
P(ξ) for the localisation lengths ξ of λ- (black) and
scrambled-λ-DNA (grey).



In Fig. 16 we repeat these calculations but with K =
1000. Clearly, P(ξ) is peaked again around 15-20 and
this time has no tail. In all cases, K > ξ. Again, the
results for scrambled DNA are different in each window,
and now even P(ξ) is somewhat shifted with respect to
λ-DNA.

Thus in conclusion, we do not see significant differences
between λ-DNA and its scrambled counterpart. Moreover,
there appears to be no large difference between the
localisation lengths measured in the coding and the
non-coding sequences of bacteriophage λ-DNA. This indicates
that the average ξ values computed in the previous sections
are sufficient when considering the electronic localisation
properties of the four complete DNA sequences.



B. Computing correlation functions 

FIG. 16: Variation of the localisation lengths for a sliding
window of length K = 1000 at E = 3 as in Fig. 15. Middle:
Same as in the top panel but with randomly scrambled λ-DNA.
Bottom: Normalised distribution functions P(ξ) for the
localisation lengths ξ of λ- (black) and scrambled-λ-DNA
(grey).

As shown in the last section, the spatial variation of
ξ for a fixed window size is characteristic of the order
of bases in the DNA sequence. Thus we can now study
how this biological information is retained at the level
of localisation lengths. In order to do so, we define the
correlation function



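$$
\mathrm{Cor}(k) = \frac{\sum_{i=1}^{n-k} \left[\xi(r_i) - \langle\xi\rangle\right]\left[\xi(r_i + k) - \langle\xi\rangle\right]}{\sum_{i=1}^{n} \left[\xi(r_i) - \langle\xi\rangle\right]^{2}} \qquad (5)
$$

where $\langle\xi\rangle = \sum_{i=1}^{n} \xi(r_i)/n$ is ξ averaged over all n =
L − (K − 1) windows, for each of which the individual
localisation length is ξ(r_i).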
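As a minimal sketch of how such a correlation function can be evaluated, the following Python function computes Cor(k) from a list of window localisation lengths. It assumes consecutive window starting positions and a normalisation by the summed squared deviations from the mean, so that Cor(0) = 1. The function name and the error handling are illustrative choices, not part of the original analysis.

```python
def correlation(xi, k):
    """Normalised autocorrelation of localisation lengths at separation k.

    xi : sequence of localisation lengths, one per consecutive window start
    k  : separation between window starts (0 <= k < len(xi))
    """
    n = len(xi)
    if not 0 <= k < n:
        raise ValueError("k must satisfy 0 <= k < len(xi)")
    mean = sum(xi) / n
    dev = [x - mean for x in xi]
    denom = sum(d * d for d in dev)
    if denom == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    numer = sum(dev[i] * dev[i + k] for i in range(n - k))
    return numer / denom


if __name__ == "__main__":
    toy = [1.0, 2.0, 3.0, 4.0]  # toy values, not real localisation lengths
    print([correlation(toy, k) for k in range(len(toy))])
```

Because the numerator sums over only n − k terms, the magnitude of Cor(k) in this form shrinks for separations k comparable to n, which should be kept in mind when interpreting features at very large k.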

In Fig. 17 we show the results obtained for λ-DNA
with windows of length 10, 100 and 1000. We first note
that Cor(k) drops rapidly until the distance k exceeds the
window width K (see the inset of Fig. 17). For k > K,
Cor(k) fluctuates typically between ±0.2 and there is a
larger anti-correlation for base-pair separations of about
k = 8000. We note that such large-scale features are not
present when considering scrambled λ-DNA instead.

FIG. 17: Cor(k) as defined in Eq. (5) for λ-DNA and K = 10,
100, and 1000 at E = 0. The inset shows the same data but
plotted as a function of normalized separation k/K.



VIII. DISCUSSION 

The fishbone and ladder models studied in the present 
paper give qualitatively similar results, i.e. a gap in the 
DOS on the order of the hopping energies to the back- 
bone, extended states for periodic DNA sequences and 
localised states for any non-zero disorder strength. Thus 
at T = 0, our results suggest that DNA is an insulator
unless perfectly ordered. Quantitatively, the localisation
lengths ξ computed for the ladder model are larger
than for the fishbone model. Since we are interested in
these non-universal lengths, the ladder model is clearly
the more appropriate model.

The localisation lengths measure the spatial extent of
a conducting electron. Our results suggest, in agreement
with all previous considerations, that poly(dG)-poly(dC)
DNA allows the largest values of ξ. Even after adding a
substantial amount of disorder, poly(dG)-poly(dC) DNA can
still support localisation lengths of a few hundred
base-pair separation lengths. With nanoscopic experiments
currently probing at most a few dozen bases, this suggests
that poly(dG)-poly(dC) DNA will appear to be conducting in
these experiments.

Furthermore, telomeric DNA is a very encouraging and
interesting naturally occurring sequence because it gives
very large localisation lengths in the weakly disordered
regime. Nevertheless, we find that all investigated
non-periodic DNA sequences, e.g. random-ATGC and λ-DNA,
give localised behaviour even in the clean state.
This indicates that they are insulating at T = 0.

When the effects of the environment, modelled by
potential changes on the backbone, are included, we find
that the localisation lengths in the two bands decrease
quickly upon increasing the disorder. Nevertheless,
depending on the value of the Fermi energy, the resulting
ξ values can still be 10-20 base-pairs long. While this
may not give metallic behaviour, it can still result in a
finite current for small sequences. We also note that these
distances are quite close to those obtained from
electron-transfer studies.

The backbone disorder also leads to states moving into
the gap. Therefore the environment prepared in the
experiments determines the gap which is being measured.
Furthermore, the localisation properties of the states in
the former gap are drastically different from those in the
two bands. Increasing the disorder leads to an increase in
the localisation lengths and thus potentially larger
currents. This is most pronounced for binary disorder, taken
to model the adhesion of cations in solution. Thus within
the two models studied, we find that the transport
properties are determined in a crucial way by the
environment. Differences in experimental set-up, such as
measurements on 2D surfaces or between elevated contacts,
are likely to lead to quite different results.

As far as the correlations within biological λ-DNA are
concerned, we see only a negligible difference between
the localisation properties of the coding and non-coding
parts. However, this is clearly dependent on the chosen
energy and the particular window lengths used.
Investigations on other DNA sequences are in progress.